India is a multilingual country with more than 1,600 languages.
Kannada, one of the major languages, originated in the state of Karnataka and
is currently ranked 33rd among the most widely spoken languages in the
world. However, the literature survey shows that much more effort is
needed to create a complete handwritten identification system. Segmentation,
which extracts significant objects from an image, is one of the crucial steps
in a handwriting identification system. The feature extraction and classification
phases of handwritten text recognition are more successful when the
selected segmentation approaches are efficient. In the proposed system,
segmentation is accomplished using bounding box and contour tracing
methods. The output is passed to the next stage of the handwritten
identification system. An average accuracy of 92.6% is achieved for line
segmentation and word segmentation.
1. INTRODUCTION

Text recognition in images aims to build a computer system that can automatically read text from
images. The many useful indexing and information retrieval applications of text detection and recognition,
such as document indexing, content-based image retrieval, and licence plate recognition, further increase
the potential for more sophisticated handwritten recognition systems. A handwritten text recognition system
eliminates the need to retype critical documents manually when entering them into electronic databases.
Image pre-processing, segmentation, feature extraction, and character recognition are common to all
handwritten recognition systems, and the outcome of each stage depends heavily on the effectiveness of
the stages that precede it. The different writing styles and text distortion in handwritten papers make it
challenging to segment and filter digitized documents based on a query. Because of the extensive database
and structural complexity, developing a text identification system for some Indian languages, such as
Kannada and Telugu, is expected to be a laborious process [1]. These challenges are further amplified by the
possibility of character overlap in some instances. Despite repeated attempts, creating a high-precision
recognition framework for all Indian languages remains very hard. The rest of the paper is organised as
follows: previous research is briefly summarized in section 2, and details of the proposed technique are
provided in section 3. The tests and results are detailed in section 4, and section 5 offers the conclusion
and recommendations for future work.
2. LITERATURE SURVEY

The literature survey concentrates on the specified subject, with a critical analysis of the works of
different authors and the relationships between them. Table 1 outlines the research carried out so far in the
area of handwritten text detection, together with the findings reported in each work. The overview provides
a comprehensive analysis of the numerous challenges related to text line segmentation, which helps
researchers understand and advance their work in this area.


Table 1. Illustrations of the research’s preliminary findings

Authors | Techniques/algorithm used for feature extraction/segmentation/classification | Accuracy | Publication year
Fernandes et al. [2] | Tesseract tool, convolutional neural network (CNN). | Tesseract tool: 86%, CNN: 87%. | 2019
Sushma et al. [3] | Pre-processing methods: height normalization, bounding box extraction and binarization. Feature extraction methods: SIFT and SURF. | 32.5% to 75% range for every word. | 2016
Gowda and Kanchana [4] | Feature extraction: curviness of the characters using CNN. Edge-based segmentation algorithm. Categorization: SVM, KNN and random forest algorithms. | Random forest: 95%, SVM: 96%, KNN: 92%. | 2022
Kohli and Kumar [5] | Segmentation facilitation feature used to pinpoint the junction path and segregate touching components. | Segmentation accuracy 89.9%. | 2021
Kaur et al. [6] | Detection of header lines, base lines, and contours. | Accuracy depends upon the segmentation technique. | 2010
Choudhary et al. [7] | Vertical segmentation algorithm. | Segmentation accuracy 83.5%. | 2013
Mello and Lacerda [8] | Segmentation via feature point selection, skeletonization and clustering using self-organizing maps. | Recognition rate 65.79%. | 2013
Mahto et al. [9] | Scheme for merging horizontal and vertical projection feature selection. | Recognition accuracy 98.06%. | 2015
Ramappa and Krishnamurthy [10] | Bounding box technique, Hough transform and contour detection. | An average segmentation rate of 91% and 70% is obtained for lines and words, respectively. | 2012
Thungamani and Kumar [11] | Horizontal projection profile, vertical projection profile. | Accuracy depends upon the segmentation algorithm. | 2012
Vishwanath et al. [12] | Composite feature vector retrieval utilizing gradient-based feature descriptors, edge density filter and adaptive projection profiling for segmentation. Classification: SVM. | The positive predictive values for Bengali, Telugu, and Kannada texts were 74.7240%, 76.9728%, and 79.9518%, respectively. | 2023
Obaidullah et al. [13] | Bounding box, chain-code direction histogram, Radon transform, categorization using MLP and random forest. | Word-level accuracies for tri-script recognition are 96.76%, 95.83%, 99.03%, and 96.60%. | 2019
Rao et al. [14] | Extended nonlinear kernel residual network. | Test accuracy 97.72%. | 2018
Kavitha and Srimathi [15] | Max pooling, Softmax classifier, CNN model. | Training accuracy of 95.16%. | 2022
Ghosh et al. [16] | Max-voting, probabilistic voting used for feature extraction, wavelet transformation, CNN for training. | Accuracy of 95.04%. | 2019
Kumari and Babu [17] | Segmentation based on mathematical morphology, CNN. | Accuracy of 98.7%. | 2021
Muppalaneni [18] | CNNs incorporated into deep learning methodologies and machine learning frameworks. | As high as 79.61% for testing precision and 96.13% for precision. | 2020
Thakral and Kumar [19] | Cluster detection technique, horizontal projection profile. | Contiguous characters segmented with 88% accuracy and touching, conjunct characters with 95% accuracy. | 2014


As a result of the drawbacks described in the works, there is room for additional study and 
advancement in the pre-processing and segmentation of handwritten documents. This inspired us to create a 
system that effectively segments and recognizes handwritten Kannada documents. 


2.1. Kannada script 

Kannada is the primary language in Karnataka. The aksharas used in Kannada developed from the
Kadamba and Chalukya scripts, which descend from Brahmi. Figure 1 illustrates the fundamental alphabet
of the Kannada language, which consists of 16 vowels and 34 consonants [20], [21]. There is a vowel sign
(modifier) for each vowel and a fundamental form for each consonant (primitive). A basic consonant can be
combined with a vowel sign to create a set of 16 composite consonant-vowel (CV) symbols known as the
“gunithakshara.” All 34 consonants in Kannada have a short/half version known as “Vatthus,” also
called half consonants or subscripts. A conjunct-consonant letter is created by placing any half-consonant
as a subscript to another consonant or a CV character. This work examines the pre-processing
and segmentation of documents written by hand in Kannada.


Figure 1. Kannada vowel and consonant samples


3. PROPOSED METHOD 

This paper proposes an improved method for pre-processing and segmenting handwritten Kannada
documents. Traditionally, the feature extraction and recognition stage in all character recognition systems
takes the result of the segmentation procedure as its input. When a sample is improperly segmented, the
recognition system is unable to identify it, but the segmentation phase is never informed of this failure.
Our proposed strategy attempts to narrow this gap between the segmentation and recognition processes.
Figure 2 shows the stages used in the recognition system. The training stage, which generates an effective
feature template from the input image, consists of pre-processing, segmentation, relevant feature extraction
and storing the feature template in the knowledge base for classification.


[Figure 2 stages: image acquisition (scanning documents, capturing document image); preprocessing (grayscale, binarization, skew detection and removal, smoothing, noise removal); line, word, and character segmentation; feature extraction; classification]

Figure 2. Flow chart


3.1. Data collection 

The Kannada handwritten dataset was created by people of different age groups. It includes diverse
alignments, writing styles, and character lengths, as well as variations in pen quality, paper quality,
ink colour and other factors that make handwritten Kannada text more complicated. As a result, handwritten
text poses a greater challenge than printed text. Every image was scanned at a resolution of 300 dpi, and
200 of these documents were used in the experiment.
3.2. Pre-processing


We used an adaptive binarization strategy for the binarization stage of the pre-processing
procedure, as shown in Figure 3. Binarization turns a multi-tonal image into a bi-tonal one. It
is customary to map the text pixels in the foreground of a document image to black and the background
to white. To detect and fix skew, we used the Hough transform, which maps points from the Cartesian
image space to a polar (angle-distance) parameter space.
Figure 3. Results for adaptive thresholding
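The paper does not state which adaptive binarization variant was used. As a minimal sketch, assuming a local-mean rule (a pixel is text when it is darker than the mean of its neighbourhood by some margin), the idea can be illustrated as follows. The function name `adaptive_threshold`, the parameters `block` and `offset`, and the sample image are hypothetical and chosen only for illustration.

```python
def adaptive_threshold(gray, block=3, offset=15):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    Assumed rule (not taken from the paper): a pixel is foreground (1) when it
    is darker than the mean of its block x block neighbourhood minus `offset`.
    """
    h, w = len(gray), len(gray[0])
    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood around (x, y), clipped at the image border.
            window = [gray[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < local_mean - offset else 0
    return out


# Hypothetical page strip: the background darkens from left to right and
# two ink pixels (values 60 and 40) sit on it.
gray = [
    [200, 190, 180, 170, 160, 150, 140],
    [200,  60, 180, 170, 160,  40, 140],
    [200, 190, 180, 170, 160, 150, 140],
]
binary = adaptive_threshold(gray, block=3, offset=15)
```

Because the threshold follows the local background level, both ink pixels are picked out even though the illumination changes across the strip, which a single global threshold handles less reliably.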


3.2.1. Skew detection and correction: Hough transform 


The Hough transform [22], initially developed to find patterns in an image, has since been extended to
find curves in both two and three dimensions. An image is converted into the Hough space for line
detection through the following steps:


- Step 1. The expression for a line in Cartesian form is:

$y = mx + b$

where m is the line’s gradient or slope (rise/run) and b is the y-intercept.

We look for lines in image space that pass through as many edge points (or foreground pixels of a
binary edge image) as possible. Suppose we have two edge points (x1, y1) and (x2, y2). For every edge
point, we compute the corresponding b values at different slopes (m=-0.5, 1.0, and 1.5), so that each edge
point traces a line b = y - mx in the (m, b) parameter space. Figures 4 and 5 illustrate the image and
parameter space in detail. The parameter-space lines of collinear edge points intersect at a shared point
(m, b), and this shared point represents the line through those points in the image space. Unfortunately,
when the line is vertical, the slope m is undefined. To get around this, we use a different parameter
space, the Hough space.


Figure 4. Image space and parameter space

Figure 5. Straight line
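The slope-intercept voting described in Step 1 can be worked through numerically. The sketch below uses the three slopes quoted in the text; the two edge points and the helper name `intercepts` are hypothetical, chosen so that both points lie on the line y = x + 1.

```python
def intercepts(point, slopes):
    """For one edge point (x, y), return b = y - m*x for each candidate slope m."""
    x, y = point
    return {m: y - m * x for m in slopes}


slopes = [-0.5, 1.0, 1.5]            # candidate slopes quoted in the text
p1, p2 = (1.0, 2.0), (3.0, 4.0)      # hypothetical edge points on y = x + 1

b1 = intercepts(p1, slopes)
b2 = intercepts(p2, slopes)

# The (m, b) pair shared by both points identifies the line through them.
shared = [(m, b1[m]) for m in slopes if abs(b1[m] - b2[m]) < 1e-9]
```

Only m = 1.0 gives the same intercept (b = 1.0) for both points, so `shared` holds the single pair that represents the line in image space. A vertical line has no finite m, which is why this parameterization is replaced by the angle-distance form.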


- Step 2. The line in the angle-distance parameter space of the polar coordinate system is:

$\rho = x\cos\theta + y\sin\theta$

where $\rho$ is the perpendicular distance from the origin to the line, in the range [-max_dist, max_dist],
and max_dist is the diagonal length of the image. $\theta$ is the angle between the x-axis and the normal
from the origin to the line, in the range [-90°, 90°].
- Step 3. Hough transform space

The image is mapped into the Hough space by computing $\rho$ for each edge point at every angle
between -90° and 90°. Each point therefore produces a curve, and the curves produced by collinear points in
the image space intersect at peaks ($\rho$, $\theta$) in the Hough transform space. The more curves
intersect at a given point, the more “votes” the corresponding line in image space receives.
Figure 6 shows the Hough lines detected. Local maxima in the accumulator give the parameters of the most
significant lines in the input image. The easiest way to find peaks is to apply an absolute or a
relative threshold. Figure 7 shows the skew-corrected image obtained after applying the Hough method.
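The voting procedure of Step 3 can be sketched in a few lines. This is an illustrative accumulator only, not the authors' implementation: the function name `hough_peak`, the one-degree angle step, and the rounding of the distance to whole pixels are assumptions, and the test points form a hypothetical line inclined at 45°.

```python
import math


def hough_peak(points, theta_step=1):
    """Vote in (rho, theta) space and return (rho, theta_degrees, votes) of the peak."""
    votes = {}
    for x, y in points:
        # theta = 90 is skipped: it describes the same lines as theta = -90.
        for theta in range(-90, 90, theta_step):
            t = math.radians(theta)
            rho = int(round(x * math.cos(t) + y * math.sin(t)))
            key = (rho, theta)
            votes[key] = votes.get(key, 0) + 1
    (rho, theta), count = max(votes.items(), key=lambda kv: kv[1])
    return rho, theta, count


# Hypothetical edge pixels lying on the line y = x.
points = [(i, i) for i in range(50)]
rho, theta, count = hough_peak(points)

# theta is the angle of the normal; the line itself is inclined at theta + 90.
skew = theta + 90
if skew > 90:
    skew -= 180
```

All 50 points vote for the same accumulator cell, so the peak identifies the line, and the inclination derived from it is the angle by which the page would be rotated back to remove the skew.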


Figure 6. Hough lines detected

Figure 7. Skew corrected image using Hough transform


3.2.2. Segmentation method

Segmentation consists of three steps: dilation, contour tracing and bounding rectangles.

- Dilation: dilation enlarges the boundaries of objects in an image by adding pixels to them. The number of
pixels added to or removed from the image’s objects is determined by the size and shape of the structuring
element used to process it.

- Contour tracing: the boundary of an object in an image is known as a contour. Contours can be represented
in a variety of ways to identify or classify objects.

- Bounding rectangles: a bounding rectangle is given by the coordinates of the border that encloses an
object in the image. The bounding box algorithm uses the x and y coordinates of the rectangle’s upper-left
and lower-right corners, which serve as a point of reference for word detection, to generate an imaginary
rectangle.

Algorithm 1 for contour tracing is used to identify word boundary pixels. A contour is the curve that
connects all the points of equal intensity along the boundary of an object. Contour tracing is carried out
on a digital image of a word to extract details about its shape that will be used as features in
classification. The under-segmentation issue caused by overlapping characters can be resolved by
contour-based approaches, since they give a clear description of the shape of the characters. Additionally,
since the baselines do not need to be adjusted numerous times, it lowers the errors made when extracting
baselines.
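How dilation and bounding rectangles work together can be shown on a toy image. The sketch below is not the paper's implementation: for brevity it finds objects with an 8-connected flood fill instead of tracing their contours, and the names `dilate` and `bounding_boxes` as well as the sample page are hypothetical.

```python
from collections import deque


def dilate(img, kw, kh):
    """Binary dilation with a kw x kh rectangular structuring element."""
    h, w = len(img), len(img[0])
    rx, ry = kw // 2, kh // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for j in range(max(0, y - ry), min(h, y + ry + 1)):
                    for i in range(max(0, x - rx), min(w, x + rx + 1)):
                        out[j][i] = 1
    return out


def bounding_boxes(img):
    """Return (x, y, w, h) for every 8-connected foreground object."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not img[y0][x0] or seen[y0][x0]:
                continue
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            xmin = xmax = x0
            ymin = ymax = y0
            while queue:  # flood fill stands in for contour tracing here
                x, y = queue.popleft()
                xmin, xmax = min(xmin, x), max(xmax, x)
                ymin, ymax = min(ymin, y), max(ymax, y)
                for j in range(max(0, y - 1), min(h, y + 2)):
                    for i in range(max(0, x - 1), min(w, x + 2)):
                        if img[j][i] and not seen[j][i]:
                            seen[j][i] = True
                            queue.append((i, j))
            boxes.append((xmin, ymin, xmax - xmin + 1, ymax - ymin + 1))
    return boxes


# Hypothetical text row: two "words", each made of two separate strokes.
page = [[0] * 10, [1, 0, 1, 0, 0, 0, 0, 1, 0, 1], [0] * 10]
stroke_boxes = bounding_boxes(page)
word_boxes = bounding_boxes(dilate(page, 3, 1))
```

Without dilation every stroke gets its own rectangle; after a horizontal dilation of width 3 the strokes of each word merge, so one rectangle per word is returned.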
To facilitate image processing, boundary detection is performed to locate the edges in an image. Of the
many edge detection techniques available, we used the Sobel edge detection method, which creates an image
emphasising edges, as shown in Figure 9. To find the level at which a threshold can be set to extract the
peak regions, we map the horizontal projection profile (HPP) of the image. The HPP is the array of row-wise
sums of pixel values in a two-dimensional (2D) image, with one entry per row. Figure 8 shows the HPP for
the Sobel edge image.
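A horizontal projection profile and its use for locating text lines can be sketched as follows. The threshold value and the function names `horizontal_projection` and `split_lines` are assumptions made for illustration; the paper does not give the exact thresholding rule.

```python
def horizontal_projection(img):
    """Row-wise sum of foreground pixels: one value per image row."""
    return [sum(row) for row in img]


def split_lines(img, threshold=0):
    """Return (top, bottom) row ranges whose projection exceeds `threshold`."""
    profile = horizontal_projection(img)
    lines, start = [], None
    for y, value in enumerate(profile):
        if value > threshold and start is None:
            start = y                        # entering a text line
        elif value <= threshold and start is not None:
            lines.append((start, y - 1))     # leaving a text line
            start = None
    if start is not None:                    # text touches the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


# Hypothetical binary page with two text lines separated by an empty row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
profile = horizontal_projection(page)
lines = split_lines(page)
```

Rows belonging to text produce peaks in the profile and blank rows produce valleys, so thresholding the profile yields the vertical extent of each line.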


Algorithm 1. Contour tracing
Step 1. read input image (img)
Step 2. convert to grayscale (gray)
Step 3. threshold = THRESH_BINARY
Step 4. set kernel size
        apply morphology
        MORPH_CLOSE and MORPH_ERODE
Step 5. get largest contour
        contours = findContours
        area_threshold = 0
        for c in contours:
            area = contourArea(c)
            if area > area_threshold:
                area_threshold = area
                big_contour = c
Step 6. get bounding box
        x, y, w, h = boundingRect(big_contour)
Step 7. draw filled contour on black background
        mask = zeros_like(gray)
        merge mask
        drawContours
Step 8. apply mask to input
        result1 = img.copy()
        result1 = bitwise_and(result1, mask)
Step 9. crop result
        result = result1[y:y+h, x:x+w]
Step 10. view result
Figure 9. Sobel edge detection
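The Sobel operator behind Figure 9 convolves the image with two 3×3 kernels, one per direction. A minimal sketch is given below; the function name `sobel_magnitude`, the |Gx| + |Gy| magnitude approximation, and the sample strip are illustrative assumptions rather than details from the paper.

```python
def sobel_magnitude(gray):
    """Approximate gradient magnitude |Gx| + |Gy| using the 3x3 Sobel kernels.

    Border pixels stay 0 because the 3x3 window does not fit there.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to horizontal edges
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = gray[y + j - 1][x + i - 1]
                    gx += kx[j][i] * p
                    gy += ky[j][i] * p
            out[y][x] = abs(gx) + abs(gy)
    return out


# Hypothetical strip with a vertical step edge between columns 2 and 3.
strip = [[0, 0, 0, 10, 10, 10] for _ in range(3)]
edges = sobel_magnitude(strip)
```

Flat regions give a zero response and only the pixels next to the intensity step are emphasised, which is the edge image on which the projection profile is then computed.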
In word segmentation, a bounding box helps the computer identify the word’s location inside
the image. The bounding box encloses the picture of the word and indicates where it is in space. Whenever
a character is detected, the same bounding box strategy is applied to separate it from its neighbours.
The final output of character segmentation is a set of clearly separated characters. Algorithm 2
presents in detail the enhanced character segmentation, which improves the detection efficiency of the
recognition system.


Algorithm 2. Segmentation
Input: Kannada handwritten scanned image of size m×n
Output: Set of segregated characters
Step 1. Pre-processing
        Convert the RGB-formatted image to a monochrome version using an adaptive
        thresholding technique to get an enhanced image.
Step 2. Skew detection and correction
        Detect the skew angle and correct it using the Hough transform technique.
Step 3. Segmentation
        Apply the morphological operation to the image.
Step 4. Find contours on the dilated image.
Step 5. Draw lines on the boundary using the values obtained from the contours method.
        A bounding rectangle is used for drawing the bounding box.
Step 6. Repeat step 5 for each line in the page.
Step 7. Change the kernel width for word segmentation
        and repeat the same for character segmentation.
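Step 7 relies on the fact that the width of the dilation kernel decides which gaps are bridged. The one-dimensional sketch below illustrates this on a single pixel row; the kernel widths, the helper names `dilate_row` and `runs`, and the sample row are hypothetical and not taken from the paper.

```python
def dilate_row(row, width):
    """1-D binary dilation: each foreground pixel spreads width // 2 to each side."""
    r = width // 2
    out = [0] * len(row)
    for x, value in enumerate(row):
        if value:
            for i in range(max(0, x - r), min(len(row), x + r + 1)):
                out[i] = 1
    return out


def runs(row):
    """Return (start, end) column ranges of consecutive foreground pixels."""
    spans, start = [], None
    for x, value in enumerate(row + [0]):    # sentinel closes a trailing run
        if value and start is None:
            start = x
        elif not value and start is not None:
            spans.append((start, x - 1))
            start = None
    return spans


# Hypothetical row: characters one pixel apart, the two words five pixels apart.
row = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1]

characters = runs(dilate_row(row, 1))   # narrow kernel keeps characters apart
words = runs(dilate_row(row, 3))        # medium kernel bridges gaps inside a word
line = runs(dilate_row(row, 7))         # wide kernel bridges gaps between words
```

The same procedure of dilating, finding objects and drawing rectangles therefore yields lines, words or characters depending only on the kernel width chosen.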


4. EXPERIMENTAL RESULTS AND DISCUSSION 

Handwritten character recognition is a challenging problem in optical character recognition applications
and pattern classification. Figure 10 shows that a sufficient number of Hough lines are detected along the
words. The Hough line method also gives the angle of each detected line, as shown in Figure 11.
Figure 10. Hough lines drawn on skewed image


The result of the line segmentation phase is displayed in Figure 12. Figures 13-15 show examples
of line-segmented, word-segmented and character-segmented images, respectively. The
proposed method yields an average segmentation accuracy of 92.6% for lines and words. However, because the
number of segmented characters exceeds the total number of characters, accuracy cannot be calculated at the
character level. This is because the Kannada script uses consonant modifiers that are joined with a
character to create compound characters, which occur very frequently in this language. Compound
characters are challenging to segment and should be approached from a different angle.
Figure 11. Results for skew angle detection
Figure 12. Result achieved for line segmentation phase
Figure 13. Accuracy achieved in line segmentation
Figure 14. Word segmentation results
Figure 15. Character segmentation


The proposed method is compared with existing methods in Table 2. Method 1 performs three-level
segmentation on a dataset of 50. Methods 2 and 3 focus only on line and word segmentation. In developing
the proposed approach, we used a dataset of 200, considerably larger than that of method 1, and
created a three-level segmentation that includes line, word, and character segmentation.


Table 2. Assessment of the proposed method with the existing strategies

Author | Segmentation method | Size of dataset | Segmentation rate
Saleem Pasha et al. [23] (method 1) | Modified projection profile, connected component | 50 | 97.5%
Alireza Alaei et al. [24] (method 2) | Potential piece-wise separation line approach | 204 | 94.98%
Mamatha et al. [25] (method 3) | Morphological operations, projection profile | 100 | 94.5%
Proposed | Contour tracing and bounding box | 200 | 92.6%

5. CONCLUSION


Segmentation, which divides the pre-processed image into several segments, is one of the crucial stages
of a handwritten recognition system. This study partitions text at three levels: lines, words, and
characters, using appropriate pre-processing approaches. Dilation, contour tracing and bounding rectangles
are applied in segmentation to identify well-separated and overlapping lines. Line and word segmentation
obtain an average accuracy of 92.6%. Due to the frequent occurrence of compound characters, complete
accuracy cannot be obtained at the character level. Such complex characters are difficult to segment, so
the problem needs to be approached from a different perspective.